text because of the embedded “/”, rather than as numerical data. Instead,
create two separate variables and enter each number into the appropriate
variable.
» When recording measurements taken in different units (such as days, weeks,
months, and years), use two columns to record the data (such as time and type).
In the first column, store the value of the variable, and in the second column,
store a code to indicate the type (such as 1 = days, 2 = weeks, 3 = months, and
4 = years). As an example, “3 weeks” would be entered as a 3 in the time column
and 2 in the type column.
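As a rough illustration of the two-column scheme above, the following Python sketch splits an entry such as “3 weeks” into a value and a unit code. The names UNIT_CODES and encode_duration are hypothetical and chosen only for this example; the codes follow the 1 = days through 4 = years scheme just described.

```python
# Hypothetical helper names; the unit codes follow the scheme in the text.
UNIT_CODES = {"days": 1, "weeks": 2, "months": 3, "years": 4}


def encode_duration(entry):
    """Split text such as '3 weeks' into (time, type) column values."""
    value_text, unit_text = entry.split()
    unit = unit_text.lower()
    if not unit.endswith("s"):
        unit += "s"  # accept singular forms such as '1 week'
    return float(value_text), UNIT_CODES[unit]


time_value, type_code = encode_duration("3 weeks")
print(time_value, type_code)
```

Here the time column receives 3.0 and the type column receives 2, so both columns stay numeric.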
Missing numerical data requires a little more thought than missing categorical
data. Some researchers use 99 (or 999, or 9999) to indicate a missing value in
categorical data, but this approach should not be used for numeric data (because
the statistical program will see these values as actual measured values, and not
codes for missing data). The simplest technique for indicating missing numerical
data is to leave the cell blank. Most software treats blank cells as missing data in a
calculation, but the behavior varies from program to program, so confirm how your
software handles missing values before you run your analysis.
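To see why a numeric missing-value code is risky, consider this minimal Python sketch. The data values are made up for illustration, and the variable names are hypothetical. It compares the mean computed when a blank cell is treated as missing with the mean computed when the same cell is coded as 999.

```python
from statistics import mean

# Hypothetical raw entries for one numeric variable; "" is a blank cell.
raw_cells = ["72", "", "68", "75"]

# Blank cells become None so they can be excluded from calculations.
values = [float(cell) if cell.strip() else None for cell in raw_cells]
observed = [v for v in values if v is not None]
print(mean(observed))

# Coding the blank as 999 instead makes it look like a real measurement.
with_sentinel = [999.0 if v is None else v for v in values]
print(mean(with_sentinel))
```

The first mean uses only the three observed values. The second is inflated because 999 is averaged in as though it had been measured.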
Entering date and time data
Now we’re going to tell you something that sounds like we’re contradicting the
advice we just gave you (but, of course, we’re not!). Most statistical software
(including Microsoft Excel) can represent dates and times as a single variable (an
“instant” on a continuous timeline), so take advantage of that if you can. In Excel,
you can enter the date and time as one variable (for example, 07/15/2020 08:23),
not as a separate date variable and a time variable. This method is especially useful
when dealing with events that take place over a short time interval (like events
occurring during a surgical procedure). It is important to collect all potential start
and end dates so any duration during the study can be calculated.
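The benefit of a single date-and-time variable can be sketched in Python, assuming entries are typed in the month/day/year format used in the example above. The end time and the variable names here are hypothetical.

```python
from datetime import datetime

FORMAT = "%m/%d/%Y %H:%M"  # month/day/year, 24-hour clock

# Hypothetical event times; the first matches the example in the text.
start = datetime.strptime("07/15/2020 08:23", FORMAT)
end = datetime.strptime("07/15/2020 10:05", FORMAT)

duration = end - start  # a timedelta
minutes = duration.total_seconds() / 60
print(minutes)
```

Because each value is one instant on a timeline, the duration is a simple subtraction, with no need to combine separate date and time columns first.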
Some programs may store a date and time as a Julian Date, whose zero occurred at
noon, Greenwich Mean Time, on Jan. 1, 4713 BC. (Nothing happened on that date;
it’s purely a numerical convenience.)
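For readers curious how such a number is computed, here is one possible sketch in Python. It relies on the fact that midnight UTC on January 1, 1970 corresponds to Julian Date 2440587.5. The function name to_julian_date is hypothetical, and a given statistics package may store dates differently.

```python
from datetime import datetime, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH_JD = 2440587.5  # Julian Date of 1970-01-01 00:00 UTC
SECONDS_PER_DAY = 86400


def to_julian_date(moment):
    """Convert a timezone-aware datetime to a Julian Date (days as float)."""
    elapsed_days = (moment - UNIX_EPOCH).total_seconds() / SECONDS_PER_DAY
    return UNIX_EPOCH_JD + elapsed_days


noon_2000 = datetime(2000, 1, 1, 12, 0, tzinfo=timezone.utc)
print(to_julian_date(noon_2000))
```

The whole-number part counts days and the fractional part carries the time of day, which is why a single number can represent both.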
What if you don’t know the day of the month? This happens a lot with medical
history items; a participant may say, “I got the flu in September 2021.” Most
software (including Excel) insists that a date variable be a complete date, and
won’t accept just a month and a year. In this case, create a business rule that sets
the day consistently (to the 1st, the 15th, or the last day of the month). Similarly, if
both the month and day are missing, you can set up a business rule to estimate both.
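A business rule for a missing day might look like the following Python sketch. The function name impute_day and the rule labels "first", "middle", and "last" are hypothetical; they simply mirror the three options mentioned above.

```python
import calendar
from datetime import date


def impute_day(year, month, rule="first"):
    """Build a complete date from a year and month using a fixed rule."""
    if rule == "first":
        day = 1
    elif rule == "middle":
        day = 15
    elif rule == "last":
        day = calendar.monthrange(year, month)[1]  # days in that month
    else:
        raise ValueError(f"unknown rule: {rule}")
    return date(year, month, day)


# "I got the flu in September 2021" under each of the three rules.
print(impute_day(2021, 9, "first"))
print(impute_day(2021, 9, "middle"))
print(impute_day(2021, 9, "last"))
```

Writing the rule as a function means the same choice is applied to every partial date in the study.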